Building a Basic Resume Parser with Gemini Pro
A quick tutorial on leveraging Gemini Pro to find the best resumes in a sea of applicants
Created on February 23|Last edited on March 25
Recruiters and HR departments need to keep up with applications. And for some companies, especially those in dynamic, fast-moving spaces, this can be a daunting task. Open roles can net thousands upon thousands of applicants and careful review of each is frequently impossible.
Resume parsers are a great solution. Parsers are automated tools that extract key information from resumes and streamline the selection process, identifying candidates that need personal attention and removing those who don't have nearly the experience or skills required for the position. And while traditional parsers often fall short in handling diverse formats and nuanced language, Gemini Pro, a powerful language model from Google, is a promising solution.
This post will guide you through the exciting world of building your own resume parser with Gemini Pro. Let's start with a quick explanation of why Gemini Pro might be helpful here:
Why Gemini Pro?
Gemini Pro brings with it several advantages that make it ideal for resume parsing:
- Natural language processing (NLP) prowess: It understands the context and intent behind words, making it adept at handling diverse writing styles and formats.
- Flexibility: Gemini Pro integrates seamlessly with various platforms and tools, allowing for easy deployment and integration.
- Decoder-only transformer: Unlike encoder-only models, which produce representations of their input for downstream tasks, Gemini Pro uses a decoder-only architecture. It generates text token by token from the provided context, and it is optimized for inference on specialized hardware like TPUs.
- Multimodal capabilities: While many models handle text alone, Gemini Pro shines in its ability to process various formats, including text, code, and images. This is achieved by representing each modality in a unified space, enabling the model to understand and generate across different domains.
- Mixture of experts (MoE): Google describes Gemini 1.5 Pro as an MoE model, a technique for scaling large models efficiently. Essentially, MoE divides parts of the model into smaller, specialized "expert" networks. A router activates only the most relevant experts for each input, which saves compute compared with running the full network every time.
- Scalability and efficiency: Gemini Pro is designed for scalability. Its architecture allows for easy distribution across multiple machines, enabling it to handle large datasets and complex tasks. Additionally, MoE contributes to efficient processing and inference, making it suitable for real-world applications.
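To make the mixture-of-experts bullet above concrete, here is a toy top-k routing sketch in plain Python. It is not Gemini's implementation; the expert functions, the gate scores, and the `top_k` value are all made up for illustration.

```python
import math

def softmax(scores):
    # Convert raw gate scores into probabilities that sum to 1
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_scores, top_k=2):
    """Route input x to the top_k highest-scoring experts only."""
    probs = softmax(gate_scores)
    # Indices of the top_k experts by gate probability
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize the gate weights over the chosen experts
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

# Hypothetical "experts": scalar functions standing in for sub-networks
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
# In a real model the gate scores come from a learned router; these are invented
gate_scores = [0.1, 2.0, 1.5, -1.0]

output, chosen = moe_forward(3.0, experts, gate_scores)
print(chosen)  # only the experts at these indices were evaluated
```

The point of the sketch is the cost profile: with four experts and `top_k=2`, only two expert functions run per input, while the other two are skipped entirely.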

Install and Imports
We will use the PyPDF2 library to read the resume PDFs and the Gemini Pro API to parse them. Install the packages and import what we need:
!pip install -q google-generativeai PyPDF2
import json
import os

import google.generativeai as genai
import PyPDF2 as pdf
Using W&B
Create a W&B account and install W&B using:
pip install wandb
...then log in with:
wandb login
Model Config
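# Read the API key from the environment and configure the client
genai.configure(api_key=os.environ.get('API_KEY'))

# List the models that support text generation
for m in genai.list_models():
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)
Now we will set up our model function and the PDF loader function.
def get_gemini_repsonse(prompt):
    # Send the prompt to Gemini Pro and return the generated text
    model = genai.GenerativeModel('gemini-pro')
    response = model.generate_content(prompt)
    return response.text

def input_pdf_text(uploaded_file):
    # Concatenate the extracted text of every page in the PDF
    reader = pdf.PdfReader(uploaded_file)
    text = ""
    for page in reader.pages:
        # Guard against pages that yield no text
        text += page.extract_text() or ""
    return text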
Generate the Prompt
Next, we build the prompt for Gemini Pro. It consists of the instructions, the resume text, and the job description:
# Prompt template
input_prompt = """
Act like a skilled and very experienced ATS (Applicant Tracking System)
with a deep understanding of the tech field, software engineering, data science,
data analysis, and big data engineering. Your task is to evaluate the resume
based on the given job description. You must consider that the job market is
very competitive, and you should provide the best assistance for improving
the resume. Assign the percentage match based on the JD and list the missing
keywords with high accuracy.
resume: {text}
description: {jd}

I want the response in the structure below:
{{"JD Match": "%", "MissingKeywords": [], "Profile Summary": ""}}
"""
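The template relies on Python's `str.format`: `{text}` and `{jd}` are placeholders, while the doubled braces `{{` and `}}` are escapes that come out as literal braces in the JSON skeleton. Here is a minimal sketch using a shortened stand-in template (`demo_template` and the sample strings are hypothetical, not part of the original code):

```python
# Shortened stand-in for input_prompt; same placeholder and brace conventions
demo_template = """resume:{text}
description:{jd}
Respond as:
{{"JD Match": "%", "MissingKeywords": [], "Profile Summary": ""}}"""

filled = demo_template.format(
    text="Five years of NLP research, PyTorch, two published papers",  # made-up resume text
    jd="AI Researcher",
)
print(filled)
```

Sending the raw template to the model without calling `.format` would pass the literal `{text}` and `{jd}` markers instead of the resume and job description.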
Inputs and results
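jd = 'AI Researcher'
text = input_pdf_text('resume.pdf')
# Fill the template with the resume text and job description before sending it
response = get_gemini_repsonse(input_prompt.format(text=text, jd=jd))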
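The prompt asks for a JSON-shaped reply, so the response can be turned into a Python dictionary with the `json` module. The sketch below is one possible approach, not part of the original code: `parse_ats_response` and `match_percent` are hypothetical helpers, and `sample_reply` is an invented string standing in for real model output. Models do not always return strict JSON, so the helper keeps only the outermost `{...}` span and returns `None` when parsing fails.

```python
import json

def parse_ats_response(raw):
    """Best-effort parse of the model's reply into a dict, or None on failure."""
    # Keep only the outermost {...} span, dropping any surrounding prose
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None

def match_percent(data):
    """Convert a "JD Match" value such as "85%" to a float, or None if unparseable."""
    try:
        return float(str(data.get("JD Match", "")).strip().rstrip("%"))
    except ValueError:
        return None

# Invented example reply; real output will vary from run to run
sample_reply = (
    'Here is the evaluation:\n'
    '{"JD Match": "85%", "MissingKeywords": ["PyTorch"], '
    '"Profile Summary": "Strong ML background."}'
)
parsed = parse_ats_response(sample_reply)
print(match_percent(parsed), parsed["MissingKeywords"])
```

Returning `None` instead of raising keeps a batch run going when a single reply is malformed; the caller can log the raw text and move on.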
Wandb Logging
We can use wandb to log our text and the corresponding responses!
import wandb

run = wandb.init(project="gemini resume")
table = wandb.Table(columns=["Resume Text", "Job Description", "Response"])
table.add_data(text, jd, response)

# Log the completed table and finish
wandb.log({"responses": table})
wandb.finish()
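Since the goal is to surface the best resumes among many applicants, the same idea extends to one row per candidate, sorted by match score. The following is a rough sketch under assumptions: `score_of` and `rank_candidates` are hypothetical helpers, the file names and scores are invented, and each result is assumed to be a dict shaped like the structure requested in the prompt.

```python
def score_of(result):
    """Read "JD Match" (e.g. "72%") as a float; unparseable values sort last."""
    try:
        return float(str(result.get("JD Match", "")).strip().rstrip("%"))
    except ValueError:
        return -1.0

def rank_candidates(results):
    """Return (filename, score) pairs sorted from best to worst match."""
    scored = [(name, score_of(result)) for name, result in results.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented parsed responses keyed by resume file name
results = {
    "resume_a.pdf": {"JD Match": "72%", "MissingKeywords": ["PyTorch"]},
    "resume_b.pdf": {"JD Match": "91%", "MissingKeywords": []},
    "resume_c.pdf": {"JD Match": "n/a", "MissingKeywords": []},
}

ranking = rank_candidates(results)
for name, score in ranking:
    print(name, score)
```

Each `(name, score)` pair could then be added to a `wandb.Table` with matching columns via `table.add_data` before logging.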
Demo
It's a bit more enjoyable to play with this in real time, so please do check out the live demo at this Hugging Face Space!
